RDA - Genomic offset predictions

Published

January 8, 2025

Most analyses conducted in this document are based on Capblancq and Forester (2021) and the associated Github repository.

Methodological comment

Two methods can be used to generate predictions based on the RDA models: the “predict” and the “loadings” methods.

The “predict” method uses the linear combinations of predictors identified for each underlying linear model (e.g., gen ~ env) to predict the genotype or allele frequency of each SNP under present and future environmental conditions. It directly applies the fitted model, making it generally more reliable for predictions. In contrast, the “loadings” method uses the scores of individuals or sites in the RDA space to compute weighted averages for each environmental variable, estimating their contribution based on these averaged scores. While the two methods can produce similar results if the model is well-fitted, discrepancies may arise if individuals or sites are not perfectly positioned in the RDA space due to unaccounted factors. Such differences are more likely when the model explains only a small proportion of the variance, introducing noise into the weighted averages.

In our study, we use the “predict” method to generate the predictions.

Code
# Do we use "predict" or "loadings" in the RDA?
rda_method <- "predict"

1 Data

Genomic data

Code
# Population-based allele frequencies
# ===================================
geno <- read.csv(here("data/DryadRepo/ImputedGenomicData_AlleleFrequencies_withoutmaf.csv"),
                     row.names = 1)

# SNP sets
# ========
snp_sets <- readRDS(here("outputs/list_snp_sets.rds"))

Climatic data

Code
# Set of climatic variables
# =========================
clim_var <- readRDS(here("data/ClimaticData/NamesSelectedVariables.rds"))

# Climatic data
# =============
# we scale the past and future climatic data at the location of the populations
# with the parameters (mean and variance) of the past climatic data.
source(here("scripts/functions/generate_scaled_clim_datasets.R"))
clim_dfs <- generate_scaled_clim_datasets(clim_var,clim_ref_adj = FALSE)

Population structure

We used the first three PCs of the PCA based on the genomic data and estimated in report 4_ReduncancyAnalysis_VariancePartionning_IdentificationCandidateSNPs.

Code
PCs <- readRDS(here("outputs/RDA/RDA_explanatorydataframes_PopLevel.rds"))[[2]] %>% 
  dplyr::select(pop, contains("PC"))

2 RDA models

2.1 Run the models

We run the RDA models with or without correction for population structure.

Code
runRDA <- function(snp_set, clim_var, clim_df, geno, pop_structure){
  
# Keep the genomic data of the selected SNPs
geno <- geno[,snp_set]
  
# RDA formula  
if(pop_structure == F){
  form_rda <- paste("geno ~ ", paste(clim_var,collapse= " + ")) 
  } else {
  form_rda <- paste("geno ~ ", paste(clim_var,collapse= " + "), " + Condition (PC1 + PC2 + PC3)") 
  }
  
# run RDA
rda_outliers <- rda(as.formula(form_rda), clim_df)

}
Code
snp_sets <- lapply(snp_sets, function(x) {
  
x$rda_model_uncorrected <- runRDA(snp_set = x$set_snps,
                                  clim_var = clim_var,
                                  clim_df = clim_dfs$clim_ref,
                                  geno = geno,
                                  pop_structure = F)

x$rda_model_corrected <- runRDA(snp_set = x$set_snps,
                                  clim_var = clim_var,
                                  clim_df = clim_dfs$clim_ref %>% inner_join(PCs, by="pop"),
                                  geno = geno,
                                  pop_structure = T)

return(x)

}) %>% setNames(names(snp_sets))

2.2 RDA biplots

An RDA biplot can be used to visualize the relationship between the selected SNPs and the underlying climatic variables.

Code
make_RDAbiplot <- function(x, pop_structure) {

if(pop_structure == T){
  selected_mod <- x$rda_model_corrected
} else {
  selected_mod <- x$rda_model_uncorrected
}
TAB_loci <- as.data.frame(scores(selected_mod, choices=c(1:2), display="species", scaling="none"))
TAB_var <- as.data.frame(scores(selected_mod, choices=c(1:2), display="bp"))

eigenvalues <- summary(eigenvals(selected_mod, model = "constrained")) %>% as.data.frame()
varexp_axis1 <- eigenvalues[2,1] *100 
varexp_axis2 <- eigenvalues[2,2] *100 


ggplot() +
  geom_hline(yintercept=0, linetype="dashed", color = gray(.80), linewidth=0.6) +
  geom_vline(xintercept=0, linetype="dashed", color = gray(.80), linewidth=0.6) +
  geom_point(data = TAB_loci, aes(x=RDA1*3, y=RDA2*3), colour = '#CD24B2', size = 2, alpha = 0.8) + #"#F9A242FF"
  geom_segment(data = TAB_var, aes(xend=RDA1, yend=RDA2, x=0, y=0), colour="black", linewidth=0.15, linetype=1, arrow=arrow(length = unit(0.02, "npc"))) +
  geom_text(data = TAB_var, aes(x=1.1*RDA1, y=1.1*RDA2, label = row.names(TAB_var)), size = 3) +
  xlab(paste0("RDA 1 (",round(varexp_axis1,1),"%)")) + 
  ylab(paste0("RDA 2 (",round(varexp_axis2,1),"%)")) +
  facet_wrap(~paste0(x$set_name, " (n=",length(x$set_snps),")")) +
  guides(color=guide_legend(title="Locus type")) +
  theme_bw(base_size = 11) +
  theme(panel.grid = element_blank(), 
        plot.background = element_blank(), 
        panel.background = element_blank(), 
        strip.text = element_text(size=11))

}

2.2.1 Without correction for population structure

Code
rda_biplots <- lapply(snp_sets, function(set) make_RDAbiplot(x=set, pop_structure = F))

grid_RDAbiplots <- plot_grid(plotlist=rda_biplots, nrow=2)

ggsave(here("figs/RDA/RDAbiplots_AllSNPsets_WithoutCorrection.pdf"), grid_RDAbiplots, width = 12, height=8)

# We may want to save a plot with only some of the SNP sets
# plot_grid(rda_biplots[[1]],rda_biplots[[3]],rda_biplots[[4]],nrow=1) %>% 
#   ggsave(here("figs/RDA/RDAbiplots_SNPsets_SelectedPlots.pdf"), ., width = 14, height=6)

grid_RDAbiplots

2.2.2 With correction for population structure

Code
rda_biplots <- lapply(snp_sets, function(set) make_RDAbiplot(x=set, pop_structure = T))

grid_RDAbiplots <- plot_grid(plotlist=rda_biplots, nrow=2)

ggsave(here("figs/RDA/RDAbiplots_AllSNPsets_WithCorrection.pdf"), grid_RDAbiplots, width = 12, height=8)

# We may want to save a plot with only some of the SNP sets
# plot_grid(rda_biplots[[1]],rda_biplots[[3]],rda_biplots[[4]],nrow=1) %>% 
#   ggsave(here("figs/RDA/RDAbiplots_SNPsets_SelectedPlots.pdf"), ., width = 14, height=6)

grid_RDAbiplots

3 Adaptive landscape

We project the genetic composition across the landscape as estimated by the RDA models.

SNP set used for projecting the adaptive landscape:

Code
# SNP set
selected_snp_set <- "all_cand" # All candidate SNPs

Correction for population structure ?

Code
pop_structure <- "uncorrected" # "uncorrected" or "corrected"
# I did not write the code to project the index based on a RDA corrected for population structure
# because I would need for that to project the population structure across the landscape
Code
# Load the function to make the projections
source(here("scripts/functions/project_adaptive_index.R"))

# calculate the genetic index across the landscape
ai_proj <- project_adaptive_index(clim_var = clim_var,
                                  K = 2,
                                  snp_set = snp_sets[[selected_snp_set]],  # SNP set
                                  method = rda_method, # Loadings or predict
                                  pop_structure = pop_structure)

RDA_proj <- lapply(ai_proj, function(x) rasterToPoints(x))

# The adaptive index is scaled between 0 and 1
for(i in 1:length(RDA_proj)){
  RDA_proj[[i]][,3] <- (RDA_proj[[i]][,3]-min(RDA_proj[[i]][,3]))/(max(RDA_proj[[i]][,3])-min(RDA_proj[[i]][,3]))
}

# formatting the data frames for ggplot
TAB_RDA <- as.data.frame(do.call(rbind, RDA_proj[1:2]))
colnames(TAB_RDA)[3] <- "value"
TAB_RDA$variable <- factor(c(rep("RDA1", nrow(RDA_proj[[1]])), rep("RDA2", nrow(RDA_proj[[2]]))), levels = c("RDA1","RDA2"))

# map options
point_size=2
x_limits = c(-10, 15)
y_limits = c(31, 52)
ggtitle <- paste0(snp_sets[[selected_snp_set]]$set_name," - RDA ", pop_structure, " - ", rda_method," method")

# Mapping
ggmap <- ggplot(data = TAB_RDA) + 
  geom_sf(data = world, fill="gray98") + 
  scale_x_continuous(limits = x_limits) +
  scale_y_continuous(limits = y_limits) + 
  geom_raster(aes(x = x, y = y, fill = cut(value, breaks=seq(0, 1, length.out=11), include.lowest = T))) + 
  scale_fill_viridis_d(alpha = 0.8, direction = -1, option = "A") +
  xlab("Longitude") + ylab("Latitude") +
  guides(fill=guide_legend(title="Genetic index")) +
  facet_grid(~ variable) +
  ggtitle(ggtitle) +
  theme_bw(base_size = 11) +
  theme(panel.grid = element_blank(), plot.background = element_blank(), panel.background = element_blank(), strip.text = element_text(size=11))

# Save the map
ggmap %>% ggsave(here(paste0("figs/RDA/AdaptiveIndexMap_",
                             selected_snp_set,"_",
                             rda_method,"_PS",
                             pop_structure,".pdf")), ., width=10,height=6)

ggmap %>% ggsave(here(paste0("figs/RDA/AdaptiveIndexMap_",
                             selected_snp_set,"_",
                             rda_method,"_PS",
                             pop_structure,".png")), ., width=10,height=6)

# to rm the title, use theme(plot.title = element_blank())

4 Genomic offset predictions

4.1 Using spatial points

Predicting the genomic offset of the studied populations under future climates.

Code
# Function to calculate the RDA genetic offset for spatial points
# ===============================================================

# snp_set = list with information about the SNP set: its name, code, list of SNPs and the RDA model fitted on this set.

# K = number of RDA axes used to calculate the genomic offset

# pop_structure = using the RDA model corrected or not for population genetic structure

# clim_ref = climatic data used to fit the RDA model (the climate-of-origin of the populations)

# clim_pred = climatic data used for predictions (so either the future climate of the populations, 
            # or climate of the common gardens or the NFI plots)

# weights = if NULL, the adaptive index is not weighted by the relative importance of the RDA axes in 
# explaining the variance.

# method = "loadings" of "predict" (method used to predict the genomic offset)

# clim_var = vector of the selected climatic variables

# CG = specify whether the genomic offset predictions are done in the common gardens (T or F)

pred_GO_spatial_points <- function(snp_set, 
                                   K,
                                   pop_structure,
                                   clim_ref, 
                                   clim_pred, 
                                   clim_var,
                                   weights = NULL, 
                                   CG = F,
                                   method){

if(pop_structure == T){
  selected_mod <- snp_set$rda_model_corrected
} else {
  selected_mod <- snp_set$rda_model_uncorrected
}
  
# weights based on the variance explained by the different axes
weights <- (selected_mod$CCA$eig/sum(selected_mod$CCA$eig))[1:K]

if(method == "predict"){
  pred_ref <- predict(selected_mod, clim_ref, type = "lc")[,1:K]
  pred_fut <- predict(selected_mod, clim_pred, type = "lc")[,1:K]
  
  if(!is.null(weights)){
    for(i in 1:K){
      pred_ref[,i] <- pred_ref[,i] * weights[i]
      pred_fut[,i] <- pred_fut[,i] * weights[i]
    }
    }
  
  list_AI <- list(pred_ref,pred_fut)
}

if(method == "loadings"){
  
  list_AI <- lapply(list(clim_ref,clim_pred), function(df){
    
    lapply(1:K, function(i){
      
      
      # Below we calculate the adaptive index for the axis i
      # For that, we multiply the value of the variables at the location of the population
      # by the loadings of the variables along the axis i
      
      ai <- df %>%
        dplyr::select(any_of(clim_var)) %>% 
        apply(1, function(x) sum(x * selected_mod$CCA$biplot[,i]))
      
      
      if(!is.null(weights)){ai <- ai * weights[i]}
      
      }) %>% 
      setNames(paste0("RDA", 1:length(.))) %>% 
      as.data.frame()
  })
  }

if(CG==F){
  
eucloffset <- unlist(lapply(1:nrow(list_AI[[1]]), function(x) dist(rbind(list_AI[[1]][x,], list_AI[[2]][x,]), method = "euclidean")))

} else if(CG == T){
  
  if(pop_structure == F){
    
eucloffset <- lapply(1:nrow(list_AI[[2]]), function(x) 
  as.matrix(dist(rbind(list_AI[[2]][x,],list_AI[[1]]), method = "euclidean"))[2:(nrow(list_AI[[1]])+1),1]) %>% 
  setNames(clim_pred$cg) %>% 
  as_tibble %>% 
  mutate(pop = clim_ref$pop, .before = 1) 
  
  } else {
    
eucloffset <- lapply(1:nrow(list_AI[[1]]), function(x){
      
      mat_go <- as.matrix(dist(rbind(list_AI[[1]][x,], list_AI[[2]][((x-1)*5+1):((x-1)*5+5),]), method = "euclidean"))[2:(nrow(list_AI[[2]])/nrow(list_AI[[1]])+1),1]
      
      tibble(cg=unique(clim_pred$cg), go = mat_go)
      }
      
      ) %>% 
      setNames(clim_ref$pop) %>% 
      bind_rows(.id="pop") %>% 
      pivot_wider(names_from = "cg", values_from = "go")
    }
  }

return(eucloffset)
}
Code
# Predict genomic offset for each set of SNPs
snp_sets <- lapply(snp_sets, function(x){

# =====================================================================
# Genomic offset prediction without correction for population structure
# =====================================================================
   
  # For each GCM
x$go_uncorrected <- lapply(names(clim_dfs$clim_pred), function(gcm){
  
pred_GO_spatial_points(snp_set = x,
                       clim_var = clim_var,
                       pop_structure = F,
                       K = 2, # why K=2??
                       clim_ref = clim_dfs$clim_ref,
                       clim_pred = clim_dfs$clim_pred[[gcm]],
                       weights = T,
                       method = rda_method)

}) %>% setNames(names(clim_dfs$clim_pred))

# =====================================================================
# Genomic offset predictions with correction for population structure
# =====================================================================

x$go_corrected <- lapply(names(clim_dfs$clim_pred), function(gcm){
  
pred_GO_spatial_points(snp_set = x,
                       clim_var = clim_var,
                       pop_structure = T,
                       K = 2,
                       clim_ref = clim_dfs$clim_ref %>% inner_join(PCs, by="pop"),
                       clim_pred = clim_dfs$clim_pred[[gcm]] %>% inner_join(PCs, by="pop"),
                       weights = T,
                       method = rda_method)

}) %>% setNames(names(clim_dfs$clim_pred))

return(x)
})

4.1.1 Relationship with Euclidean distance

Code
source(here("scripts/functions/make_eucli_plot.R"))

# Calculate the Euclidean climatic distance
list_dist_env <- clim_dfs$clim_pred %>% lapply(function(clim_pred){
  
Delta = clim_dfs$clim_ref %>% dplyr::select(any_of(clim_var)) - clim_pred %>% dplyr::select(any_of(clim_var)) 
dist_env = sqrt( rowSums(Delta^2) )

})

# Main gene pools (for the figures)
gps <- readRDS(here("data/GenomicData/MainGenePoolPopulations.rds")) %>%  arrange(pop)
Code
par(mfrow=c(1,2))

lapply(snp_sets, function(x) {

lapply(names(list_dist_env), function(gcm){
  
make_eucli_plot(
  X = list_dist_env[[gcm]],
  Y = x$go_uncorrected[[gcm]],
  colors = gps$color_main_gp_pop,
  color_names = gps$main_gp_pop,
  ylab = "RDA genomic offset",
  legend_position="topleft",
  plot_title = paste0(x$set_name," - ", gcm))

})
}) 

Code
# ====================
# Making scatter plots
# ====================

# Axis limits
# ===========
max_go <- lapply(snp_sets, function(z){
  vec_cor <- z$go_corrected %>% unlist(use.names = F)
  vec_uncor <- z$go_uncorrected %>% unlist(use.names = F)
  
  c(vec_uncor,vec_cor) 
}) %>% unlist() %>% max()
max_go <- max_go + 0.01

range_eucli <- list_dist_env %>% unlist() %>% range()

# Run the function
# ================
lapply(snp_sets, function(set_i) {

  lapply(c("uncorrected","corrected"), function(correction){

p <- lapply(names(list_dist_env), function(gcm){
  
make_ggscatterplot(
  x = list_dist_env[[gcm]],
  y = set_i[[paste0("go_",correction)]][[gcm]],
  title=gcm,
  range_eucli = range_eucli,
  max_go = max_go)

})

# remove y-labels to graphs in the second column
p[[2]] <- p[[2]] + ylab("")
p[[4]] <- p[[4]] + ylab("")


# remove x-labels to graphs in the second and third rows
p[[1]] <- p[[1]] + xlab("")
p[[2]] <- p[[2]] + xlab("")
p[[3]] <- p[[3]] + xlab("")

p[[6]] <- get_legend(p[[1]])

for(i in 1:5){p[[i]] <- p[[i]]  +  theme(legend.position = "none")} 


plot_grid(plotlist=p, nrow = 3) %>% 
  ggsave(here(paste0("figs/RDA/ScatterPlotEucliDistance_",set_i$set_code,"_PS",correction,"_",rda_method,".pdf")), 
         .,
         width=7,
         height=8,
         device="pdf")

})
  
})

4.1.2 Maps

Code
# Function to make the genomic offset maps
source(here("scripts/functions/make_go_map.R"))

# Population coordinates
pop_coord <-  readRDS(here(paste0("data/ClimaticData/MaritimePinePops/ClimatePopulationLocationPointEstimates_ReferencePeriods_noADJ.rds")))[[1]]$ref_means %>% dplyr::select(pop,longitude,latitude)

# Generate the maps for each set of SNPs and each GCM
lapply(snp_sets, function(x) {

  lapply(c("uncorrected","corrected"), function(correction){
    
    # Find minimum and maximum values of genomic offset for the maps
go_limits <- lapply(names(x[[paste0("go_",correction)]]), function(gcm){
x[[paste0("go_",correction)]][[gcm]]
})  %>%  unlist() %>% range()

# The minimum GO value is very very small, almost zero, so we fix it to zero.
go_limits[[1]] <- 0

go_maps <- lapply(names(x[[paste0("go_",correction)]]), function(gcm){

#plot_title <- paste0(x$set_name, " - RDA ", correction)
df <- pop_coord %>% mutate(GO = x[[paste0("go_",correction)]][[gcm]])
  

make_go_map(df = df,
            point_size = 3, 
            go_limits = go_limits,
            type = "pop",
            cg_coord = NULL,
            plot_title = gcm)

})

legend_maps  <- get_legend(go_maps[[1]])

go_maps <- lapply(go_maps, function(y) y + theme(legend.position = "none"))

go_maps$legend_maps <- legend_maps

go_maps <-plot_grid(plotlist=go_maps)

# save the figures
ggsave(here(paste0("figs/RDA/GOMaps_PopLocations_",x$set_code,"_PS",correction,"_",rda_method,".pdf")), go_maps, width=15,height=10, device="pdf")
ggsave(here(paste0("figs/RDA/GOMaps_PopLocations_",x$set_code,"_PS",correction,"_",rda_method,".png")), go_maps, width=10,height=6)
  


# =========
# Add title
# =========
title <- ggdraw() + 
  draw_label(
    paste0(x$set_name," - RDA ",correction," - ", rda_method, " method"),
    fontface = 'bold',
    x = 0,
    hjust = 0
  ) +
  theme(plot.margin = margin(0, 0, 0, 7))

# merge title and plots
plot_grid(
  title, go_maps,
  ncol = 1,
  rel_heights = c(0.1, 1))

  })
  
})

For each GCM, we can attribute the value 1 to the top five populations with the highest genomic offset and we can attribute the value 0 to the other populations. We then count the number of 1 for each population, which gives the table and map below:

Code
source(here("scripts/functions/make_high_go_pop_maps.R"))

high_go_pops <- make_high_go_pop_maps(pop_coord=pop_coord,
                                      list_go = snp_sets$common_cand$go_uncorrected,
                                      ggtitle="Common candidate SNPs - RDA uncorrected",
                                      nb_id_pop = 5) # number of selected populations
Warning: A numeric `legend.position` argument in `theme()` was deprecated in ggplot2 3.5.0.
ℹ Please use the `legend.position.inside` argument of `theme()` instead.
Code
saveRDS(high_go_pops, file = here(paste0("outputs/RDA/high_go_pops_",rda_method,".rds")))

high_go_pops[[1]] %>% kable_mydf
pop longitude latitude GFDL-ESM4 IPSL-CM6A-LR MPI-ESM1-2-HR MRI-ESM2-0 UKESM1-0-LL sum_go
CAD -6.42 43.54 0 0 0 0 0 0
LAM -6.22 43.56 0 0 0 0 0 0
SIE -6.49 43.53 0 0 0 0 0 0
ARM -6.46 43.30 0 0 0 0 0 0
ALT -6.49 43.28 0 0 0 0 0 0
BAY -2.88 41.52 0 0 0 0 0 0
PUE -6.63 43.55 0 0 0 0 0 0
CAS -6.98 43.50 0 0 0 0 0 0
SAL -3.06 41.83 0 0 0 0 0 0
BON -1.66 39.99 0 0 0 0 0 0
MIM -1.30 44.13 0 0 0 0 0 0
PIA 9.46 42.02 0 0 0 0 0 0
SEG -8.45 42.82 0 0 0 0 0 0
CAR -4.28 41.17 0 0 0 0 0 0
HOU -1.15 45.18 0 0 0 0 0 0
VAL -4.31 40.52 0 0 0 0 0 0
CUE -4.48 41.37 0 0 0 0 0 0
PIE 9.04 41.97 0 0 0 0 0 0
COC -4.50 41.25 0 0 0 0 0 0
VER -1.09 45.55 0 0 0 0 0 0
CEN -4.49 40.28 0 0 0 0 0 0
SAC -8.36 42.12 0 0 0 0 0 0
STJ -2.03 46.76 0 0 0 0 0 0
PET -1.30 44.06 0 0 1 0 0 1
OLO -1.83 46.57 0 0 0 1 0 1
QUA -0.36 38.97 1 0 0 0 0 1
ARN -5.12 40.19 1 0 0 0 0 1
LEI -8.96 39.78 0 1 0 0 1 2
PLE -2.34 47.78 0 0 1 1 0 2
OLB -0.62 40.17 1 0 1 0 0 2
MAD -5.23 35.18 0 1 0 1 1 3
COM -3.95 36.83 1 1 0 0 1 3
TAM -5.02 33.60 0 1 1 1 1 4
ORI -2.35 37.53 1 1 1 1 1 5
Code
high_go_pops[[2]]

4.1.3 Comparing GO predictions

We look at the correlation across the different genomic offset predictions at the location of the populations, i.e. those based on all SNPs and those based on sets of candidates or control SNPs.

Code
lapply(names(snp_sets[[1]]$go_corrected), function(gcm){
  
  lapply(c("uncorrected","corrected"), function(correction){
    
    lapply(snp_sets, function(x) x[[paste0("go_",correction)]][[gcm]]) %>% as_tibble() %>% bind_cols(tibble(PCs)[,"pop"])
  
}) %>% 
    setNames(c("(uncorr.)","(corr.)")) %>% 
    bind_rows(.id="correction") %>% 
    pivot_wider(names_from = "correction", values_from = names(snp_sets), names_sep = " ") %>% 
    dplyr::select(-pop) %>% 
    cor() %>% 
    corrplot(method = 'number',
               type = 'lower', 
               diag = FALSE,
               mar=c(0,0,2,0),
               title=gcm,
               tl.cex=1,
               number.cex=1)
  
})

4.2 Using rasters

We predict the genomic offset across the maritime pine distribution only for the three sets of candidate SNPs.

For these predictions, we use RDA models that are not corrected for population genetic structure. Correcting for population genetic structure would require projecting it across the entire maritime pine distribution. However, the exact geographical distribution of gene pools in maritime pine (i.e., the spatial distribution of population genetic structure) is not known. Interpolating this structure would introduce substantial noise into the predictions. Therefore, we only use RDA models without correction for population genetic structure in the genomic offset predictions presented in this section.

Code
popstructure_correction = "PSuncorrected"
Code
# Function to calculate the RDA genetic offset using rasters
# ==========================================================

# Arguments
# =========
# snp_set = list with at least the RDA models 
# clim_var = selected climatic variables
# K = number of RDA axes used to calculate the genomic offset
# gcm = Global Climate Model of the raster with future climatic data
# clim_ref_adj = TRUE or FALSE, specify whether the point estimate climatic data used to scale the rasters should be adjusted or not for elevation
# ref_period = the reference period used to calculate the adaptive index, can be 1901-1950 or 1960-1991
# method = `loadings` or `predict` 
# popstructure_correction = using the RDA model corrected or not for population genetic structure


pred_GO_rasters <- function(snp_set, 
                            clim_var,
                            K, 
                            popstructure_correction,
                            gcm,
                            range_buffer = shapefile(here('data/Mapping/PinpinDistriEUforgen_NFIplotsBuffer10km.shp')),
                            method,
                            clim_ref_adj = FALSE,
                            ref_period = "ref_1901_1950"){

if(popstructure_correction == "PScorrected"){
  selected_mod <- snp_set$rda_model_corrected
} else if(popstructure_correction =="PSuncorrected"){
  selected_mod <- snp_set$rda_model_uncorrected
}
  
# Load point estimate climatic data of the reference period
if(clim_ref_adj==TRUE){adj <- "ADJ"} else {adj <- "noADJ"}  
clim_ref_pe <- readRDS(here(paste0("data/ClimaticData/MaritimePinePops/ClimatePopulationLocationPointEstimates_ReferencePeriods_",adj,".rds")))[[ref_period]]
  
# Extract scaling parameters, i.e. mean and variance  
scale_params <- lapply(clim_var, function(x){
    
    vec_var <- clim_ref_pe$ref_means[,x] %>% pull()
    
    list(mean = mean(vec_var), sd = sd(vec_var))
    
}) %>% setNames(clim_var)
  
  
# Rasters with climatic data for the reference period
path <- here(paste0("data/ClimaticData/ClimateDTRasters/1km_",clim_ref_pe$range[[1]],"-",clim_ref_pe$range[[2]],"_Extent-JulietteA/"))
tif_paths <- lapply(clim_var, function(x) paste0(path,"/",x,".tif"))
ref_rasts <- raster::stack(tif_paths)
  
# Raster with future climatic data
path <- here(paste0("data/ClimaticData/ClimateDTRasters/1km_",gcm,"_2041-2070_ssp370_Extent-JulietteA/"))
tif_paths <- lapply(clim_var, function(x) paste0(path,"/",x,".tif"))
fut_rasts <- raster::stack(tif_paths)

# checking that the CRS is the same for the buffer and the rasters
if(identical(crs(range_buffer),crs(ref_rasts))==FALSE){
  stop(paste0("CRS of the range buffer is not the same as the CRS of the reference period raster."))} 
if(identical(crs(range_buffer),crs(fut_rasts))==FALSE){
  stop(paste0("CRS of the range buffer is not the same as the CRS of future climate rasters."))} 

# Mask with the range if supplied
if(!is.null(range_buffer)){
  ref_rasts <- mask(ref_rasts, range_buffer)
  fut_rasts <- mask(fut_rasts, range_buffer)}


# extract coordinates and climatic values, and mean-center the climatic data
ref_vals <- as.data.frame(rasterToPoints(ref_rasts))
for(i in clim_var){ref_vals[,i] <- (ref_vals[,i] - scale_params[[i]]$mean) / scale_params[[i]]$sd}
fut_vals <- as.data.frame(rasterToPoints(fut_rasts))
for(i in clim_var){fut_vals[,i] <- (fut_vals[,i] - scale_params[[i]]$mean) / scale_params[[i]]$sd}


# Predict genomic offset for each pixel based on the loadings of the climatic variables
if(method == "loadings"){
  
    ref_proj <- list()
    fut_proj <- list()
    go_proj <- list()
  
    # Projection for each RDA axis
    for(i in 1:K){
      
      # Calculate adaptive index for the reference period
      ref_rast <- ref_rasts[[1]]
      ref_rast[!is.na(ref_rast)] <- as.vector(apply(ref_vals[,clim_var], 1, function(x) sum( x * selected_mod$CCA$biplot[,i])))
      names(ref_rast) <- paste0("RDA_ref_", as.character(i))
      ref_proj[[i]] <- ref_rast
      names(ref_proj)[i] <- paste0("RDA", as.character(i))
      
      # Calculate adaptive index under future climates
      fut_rast <- fut_rasts[[1]]
      fut_rast[!is.na(fut_rast)] <- as.vector(apply(fut_vals[,clim_var], 1, function(x) sum( x * selected_mod$CCA$biplot[,i])))
      names(fut_rast) <- paste0("RDA_fut_", as.character(i))
      fut_proj[[i]] <- fut_rast
      names(fut_proj)[i] <- paste0("RDA", as.character(i))
      
      # Calculate genetic offset based on a single RDA axis
      go_proj[[i]] <- abs(ref_proj[[i]] - fut_proj[[i]])
      names(go_proj)[i] <- paste0("RDA", as.character(i))
    }
}


  
# Predict genomic offset for each pixel based on predict.RDA
  if(method == "predict"){
    
    # Prediction with the RDA model under reference-period climates and future climates
    ref_pred <- predict(selected_mod, ref_vals[,-c(1,2)], type = "lc")
    fut_pred <- predict(selected_mod, fut_vals[,-c(1,2)], type = "lc")
    
    ref_proj <- list()
    fut_proj <- list()
    go_proj <- list()
    
    for(i in 1:K){
      
      # Calculate adaptive index for the reference period
      ref_rast <- rasterFromXYZ(data.frame(ref_vals[,c(1,2)], Z = as.vector(ref_pred[,i])), crs = crs(ref_rasts))
      names(ref_rast) <- paste0("RDA_ref_", as.character(i))
      ref_proj[[i]] <- ref_rast
      names(ref_proj)[i] <- paste0("RDA", as.character(i))
      
      # Calculate adaptive index under future climates
      fut_rast <- rasterFromXYZ(data.frame(ref_vals[,c(1,2)], Z = as.vector(fut_pred[,i])), crs = crs(ref_rasts))
      names(fut_rast) <- paste0("RDA_fut_", as.character(i))
      fut_proj[[i]] <- fut_rast
      names(fut_proj)[i] <- paste0("RDA", as.character(i))
      
      # Calculate genetic offset based on a single RDA axis
      go_proj[[i]] <- abs(ref_proj[[i]] - fut_proj[[i]])
      names(go_proj)[i] <- paste0("RDA", as.character(i))
    }
  }

  # Weights based on axis eigen values
  weights <- selected_mod$CCA$eig/sum(selected_mod$CCA$eig)
  
  # Weight the current and future adaptive indices based on the eigen values of the associated axes
  ref_proj_weighted <- do.call(cbind, lapply(1:K, function(x) rasterToPoints(ref_proj[[x]])[,-c(1,2)]))
  ref_proj_weighted <- as.data.frame(do.call(cbind, lapply(1:K, function(x) ref_proj_weighted[,x]*weights[x])))
  fut_proj_weighted <- do.call(cbind, lapply(1:K, function(x) rasterToPoints(fut_proj[[x]])[,-c(1,2)]))
  fut_proj_weighted <- as.data.frame(do.call(cbind, lapply(1:K, function(x) fut_proj_weighted[,x]*weights[x])))
  
  # Predict a global genetic offset, incorporating the K first axes weighted by their eigen values
  go_global_proj <- go_proj[[1]]
  go_global_proj[!is.na(go_global_proj)] <- unlist(lapply(1:nrow(ref_proj_weighted), function(x) dist(rbind(ref_proj_weighted[x,], fut_proj_weighted[x,]), method = "euclidean")))
  names(go_global_proj) <- "global_offset"
  
  # Return :
    # projections of AI for current and future climates for each RDA axis
    # prediction of genomic offset for each RDA axis
    # a global genomic offset, which incorporates the K first axes weighted by their eigen values
    # weights of the RDA axes
  return(list(ref_proj = ref_proj, fut_proj = fut_proj, go_proj = go_proj, go_global_proj = go_global_proj, weights = weights[1:K]))
  
  # We can chose to return only the global genomic offset, so that the function output is smaller:
  #  return(list(go_global_proj = go_global_proj))

}
Code
# For each snp set and each GCM, we apply the function pred_GO_rasters
go_rasters <- snp_sets[1:3] %>% lapply(function(snp_set){
  lapply(names(clim_dfs$clim_pred), function(gcm){
  
  pred_GO_rasters(snp_set,
                  clim_var,
                  K=2,
                  popstructure_correction,
                  gcm = gcm, 
                  method = rda_method)}) %>% 
    setNames(names(clim_dfs$clim_pred)) %>% 
    saveRDS(file=here(paste0("outputs/RDA/go_rasters_",snp_set$set_code,"_",rda_method,"_",popstructure_correction,".rds")))
  })

4.2.1 GCM specific maps

Code
# Map options
# ===========
point_size = 2
x_limits = c(-10, 15)
y_limits = c(31, 52)
Code
names(snp_sets)[1:3] %>% lapply(function(x){

go_rasters <-readRDS(file=here(paste0("outputs/RDA/go_rasters_",snp_sets[[x]]$set_code,"_",rda_method,"_",
                                      popstructure_correction,".rds")))

go_dfs <- lapply(names(go_rasters), function(gcm){
  rasterToPoints(go_rasters[[gcm]]$go_global_proj) %>% as_tibble()}) %>% 
  setNames(names(go_rasters))


## GCM specific maps
####################
go_limits <- go_dfs %>% lapply(function(x) x %>% pull(global_offset)) %>% unlist() %>% range()

# The minimum GO value is very very small, almost zero, so we fix it to zero.
go_limits[[1]] <- 0

go_maps <- lapply(names(clim_dfs$clim_pred), function(gcm){

ggplot(data = go_dfs[[gcm]]) +
  geom_sf(data = world, fill="gray98") +
  scale_x_continuous(limits = x_limits) +
  scale_y_continuous(limits = y_limits) +
  geom_raster(aes(x = x, y = y, fill = global_offset), alpha = 1) +
  scale_fill_gradient2(low="blue", mid= "yellow", high="red",
                       midpoint=(max(go_limits[[2]])-min(go_limits[[1]]))/2,
                       limits=go_limits,
                       name = "Genomic offset") +
  xlab("Longitude") + ylab("Latitude") +
  ggtitle(gcm) +
  theme_bw(base_size = 11) +
  theme(panel.grid = element_blank(),
        plot.background = element_blank(),
        panel.background = element_blank(),
        #legend.position = c(0.85,0.15),
        legend.box.background = element_rect(colour = "gray80"),
        strip.text = element_text(size=11))

})

legend_maps  <- get_legend(go_maps[[1]])

go_maps <- lapply(go_maps, function(y) y + theme(legend.position = "none"))

go_maps$legend_maps <- legend_maps

go_maps <- plot_grid(plotlist=go_maps)

ggsave(here(paste0("figs/RDA/GOMap_",x,"_",rda_method,"_",
                                      popstructure_correction,".pdf")), go_maps, width=11,height=7, device="pdf")
ggsave(here(paste0("figs/RDA/GOMap_",x,"_",rda_method,"_",
                                      popstructure_correction,".png")), go_maps, width=11,height=7)

})

We visualize the generated maps.

All candidate SNPs

Common candidate SNPs

Candidate SNPs with correction for population genetic structure

4.2.2 Maps of GO mean across GCM predictions

We generate a map with the mean of genomic offset predictions across GCMs for the set with all candidate SNPs.

Code
go_rasters <-readRDS(file=here(paste0("outputs/RDA/go_rasters_all_cand_",rda_method,"_",popstructure_correction,".rds")))

go_dfs <- lapply(names(go_rasters), function(gcm){
  rasterToPoints(go_rasters[[gcm]]$go_global_proj) %>% as_tibble()}) %>% 
  setNames(names(go_rasters))

go_df_mean <- go_dfs %>% 
  bind_rows(.id="GCM") %>% 
  pivot_wider(names_from = "GCM", values_from = "global_offset") %>% 
  rowwise() %>% 
  mutate(mean_offset = mean(c_across(`GFDL-ESM4`:`UKESM1-0-LL`), na.rm = TRUE)) %>% 
  ungroup()

saveRDS(go_df_mean, file=here(paste0("outputs/RDA/go_df_mean_all_cand_",rda_method,"_",popstructure_correction,".rds")))
Code
go_df_mean <- readRDS(file=here(paste0("outputs/RDA/go_df_mean_all_cand_",rda_method,"_",popstructure_correction,".rds")))

p <- ggplot(data = go_df_mean) +
  geom_sf(data = world, fill="gray98") +
  scale_x_continuous(limits = x_limits) +
  scale_y_continuous(limits = y_limits) +
  geom_raster(aes(x = x, y = y, fill = mean_offset), alpha = 1) +
  scale_fill_gradient2(low="blue", mid= "yellow", high="red",
                       midpoint=(max(go_df_mean$mean_offset)-min(go_df_mean$mean_offset))/2,
                       limits=c(0, max(go_df_mean$mean_offset)),
                       name = "Genomic offset from RDA") +
  xlab("Longitude") + ylab("Latitude") +
  theme_bw(base_size = 11) +
  theme(panel.grid = element_blank(),
        plot.background = element_blank(),
        panel.background = element_blank(),
        legend.position = c(0.8,0.15),
        legend.box.background = element_rect(colour = "gray80"),
        legend.title = element_text(size=10),
        strip.text = element_text(size=11))

ggsave(here(paste0("figs/RDA/GOmeanProjections_all_cand_",rda_method,"_",
                                      popstructure_correction,".pdf")), p, width=6, height=6, device="pdf")
Warning: Raster pixels are placed at uneven horizontal intervals and will be shifted
ℹ Consider using `geom_tile()` instead.
Code
ggsave(here(paste0("figs/RDA/GOmeanProjections_all_cand_",rda_method,"_",
                                      popstructure_correction,".png")), p, width=6, height=6)
Warning: Raster pixels are placed at uneven horizontal intervals and will be shifted
ℹ Consider using `geom_tile()` instead.
Code
p
Warning: Raster pixels are placed at uneven horizontal intervals and will be shifted
ℹ Consider using `geom_tile()` instead.

4.2.3 Corr btw predictions with rasters or spatial points

We check that the genomic offset predictions we obtained with the spatial points are highly correlated (ie similar) to those obtained with the rasters.

Comment When we extract the genomic offset values from the rasters at the location of the populations, we have missing values for some populations. Indeed, some populations are not included in the buffer of the species range that I used, based on the EUFORGEN distribution and 10km around the NFI plots. We may want to modify the buffer of the species range to include those populations!

Code
# checking correlations
names(snp_sets)[1:3] %>% lapply(function(x){
  
cor_go <- lapply(names(clim_dfs$clim_pred), function(gcm){

go_rasters <- readRDS(file=here(paste0("outputs/RDA/go_rasters_",
                                       snp_sets[[x]]$set_code,"_",
                                       rda_method,"_",
                                       popstructure_correction,".rds"))) 
  
go_rast <- raster::extract(go_rasters[[gcm]]$go_global_proj, pop_coord[,c("longitude","latitude")])

cor(snp_sets[[x]]$go_uncorrected[[gcm]],go_rast,use= "complete.obs")
  
}) %>% unlist() 
  
tibble(gcm=names(clim_dfs$clim_pred),
       cor_go=cor_go)
  
}) %>% 
  setNames(names(snp_sets)[1:3]) %>% 
  bind_rows(.id="snp_set") %>% 
  kable_mydf()
snp_set gcm cor_go
all_cand GFDL-ESM4 0.98
all_cand IPSL-CM6A-LR 0.99
all_cand MPI-ESM1-2-HR 0.99
all_cand MRI-ESM2-0 0.99
all_cand UKESM1-0-LL 0.99
common_cand GFDL-ESM4 0.99
common_cand IPSL-CM6A-LR 0.98
common_cand MPI-ESM1-2-HR 0.98
common_cand MRI-ESM2-0 0.99
common_cand UKESM1-0-LL 0.99
corrected_cand GFDL-ESM4 0.99
corrected_cand IPSL-CM6A-LR 0.97
corrected_cand MPI-ESM1-2-HR 0.99
corrected_cand MRI-ESM2-0 0.99
corrected_cand UKESM1-0-LL 0.99

This is ok!

5 Validation - NFI plots

Code
# Load the climatic data of the NFI plots.
nfi_clim <- readRDS(here("data/ClimaticData/NFIplots/NFIclimate.rds"))
  
# Keep only the climatic variables of interest and scale the climatic data
source(here("scripts/functions/generate_scaled_nfi_clim_datasets.R"))
nfi_dfs <- generate_scaled_nfi_clim_datasets(clim_var, clim_ref = nfi_clim$clim_ref, clim_pred = nfi_clim$clim_survey)

# ========================================
# Assign PCs Based on Geographic Proximity
# ========================================
pop_coords_withPCs <- readRDS(here("data/ClimaticData/MaritimePinePops/ClimatePopulationLocationPointEstimates_ReferencePeriods_ADJ.rds"))[[1]][[2]][,c("pop","longitude","latitude")] %>% 
  inner_join(PCs, by="pop")

nfi_coords <- nfi_clim$clim_ref[,c("plotcode","longitude", "latitude")]

# Calculate the geographic distance between each NFI plot and population
dist_matrix <- distm(nfi_coords[,c("longitude", "latitude")],
                     pop_coords_withPCs[,c("longitude", "latitude")], fun = distHaversine)

# Find the closest population for each NFI plot
closest_pop_idx <- apply(dist_matrix, 1, which.min)

# Assign PC1, PC2, and PC3 values from the closest population to each NFI plot
nfi_coords <- nfi_coords %>%
  mutate(
    closest_pop = pop_coords_withPCs$pop[closest_pop_idx],
    PC1 = pop_coords_withPCs$PC1[closest_pop_idx],
    PC2 = pop_coords_withPCs$PC2[closest_pop_idx],
    PC3 = pop_coords_withPCs$PC3[closest_pop_idx]
  )


# ==========================
# Genomic offset predictions
# ==========================
snp_sets <- lapply(snp_sets, function(x){

# without correction for population structure  
x$go_nfi_uncorrected <- pred_GO_spatial_points(snp_set = x,
                                   K = 2,
                                   pop_structure = F,
                                   clim_var=clim_var,
                                   clim_ref = nfi_dfs$clim_ref,
                                   clim_pred = nfi_dfs$clim_pred,
                                   weights = T,
                                   method = rda_method)


# With correction for population structure
x$go_nfi_corrected <- pred_GO_spatial_points(snp_set = x,
                                   K = 2,
                                   pop_structure = T,
                                   clim_var=clim_var,
                                   clim_ref = nfi_dfs$clim_ref %>% inner_join(nfi_coords[,c("plotcode","PC1","PC2","PC3")], by="plotcode"),
                                   clim_pred = nfi_dfs$clim_pred %>% inner_join(nfi_coords[,c("plotcode","PC1","PC2","PC3")], by="plotcode"),
                                   weights = T,
                                   method = rda_method)

return(x)
})
Code
# checking missing data
# lapply(snp_sets, function(x) sum(is.na(x$go_nfi)))

# # Find minimum and maximum values of genomic offset for the maps
# go_limits <- lapply(snp_sets, function(snp_set) 
#                   c(snp_set$go_nfi_uncorrected, snp_set$go_nfi_corrected)  
#                   
#                     ) %>%  unlist() %>% range()
# # The minimum GO value is very very small, almost zero, so we fix it to zero.
# go_limits[[1]] <- 0


# map genomic offset predictions in the NFI plots 
lapply(snp_sets, function(x) {
  
  lapply(c("uncorrected","corrected"), function(correction){
    
    GO <- x[[paste0("go_nfi_",correction)]]
    
    # Find minimum and maximum values of genomic offset for the maps
    go_limits <- c(0,max(GO))
    
    df <- nfi_coords %>% 
      dplyr::select(contains("ude")) %>% 
      mutate(GO = GO)

    plot_title <- paste0(x$set_name, " - RDA ", correction)
    
    p <- make_go_map(
      df = df, 
      type="NFI",
      point_size = 0.5,
      go_limits = go_limits,
      legend_position = c(0.85,0.2),
      y_limits = c(35, 51),
      plot_title = plot_title)

# Figure for the SI
# =================
if(x$set_code=="common_cand"){ 
  p_SI <- p + theme(plot.title = element_blank())
  ggsave(here(paste0("figs/RDA/NFI_GOmap_CommonCandidateSNPs_PS",correction,"_",rda_method,".pdf")), p_SI, width=6,height=6, device="pdf")
  ggsave(here(paste0("figs/RDA/NFI_GOmap_CommonCandidateSNPs_PS",correction,"_",rda_method,".png")), p_SI, width=6,height=6)
  
  } else if(x$set_code=="control"){
    p_SI <- p + theme(plot.title = element_blank())
    ggsave(here(paste0("figs/RDA/NFI_GOmap_ControlSNPs_PS",correction,"_",rda_method,".pdf")), p_SI, width=6,height=6, device="pdf")
    ggsave(here(paste0("figs/RDA/NFI_GOmap_ControlSNPs_PS",correction,"_",rda_method,".png")), p_SI, width=6,height=6)
    
  } else if(x$set_code=="all"){
    p_SI <- p + theme(plot.title = element_blank())
    ggsave(here(paste0("figs/RDA/NFI_GOmap_AllSNPs_PS",correction,"_",rda_method,".pdf")), p_SI, width=6,height=6, device="pdf")
    ggsave(here(paste0("figs/RDA/NFI_GOmap_AllSNPs_PS",correction,"_",rda_method,".png")), p_SI, width=6,height=6)
    }

# Show maps in the Quarto document
# ================================
p
})
})

We look at the correlation across the different genomic offset predictions in the NFI plots.

Code
lapply(c("uncorrected","corrected"), function(correction){
  
  lapply(snp_sets, function(x) x[[paste0("go_nfi_",correction)]]) %>% as_tibble() %>% bind_cols(nfi_coords[,"plotcode"])
  
}) %>% 
    setNames(c("(uncorr.)","(corr.)")) %>% 
    bind_rows(.id="correction") %>% 
    pivot_wider(names_from = "correction", values_from = names(snp_sets), names_sep = " ") %>% 
    dplyr::select(-plotcode) %>% 
    cor() %>% 
    corrplot(method = 'number',
               type = 'lower', 
               diag = FALSE,
               mar=c(0,0,2,0),
               tl.cex=1,
               number.cex=1)

6 Validation - Common gardens

Code
cg_clim <- readRDS(here("data/ClimaticData/CommonGardens/ClimateCG.rds")) %>%  dplyr::select(cg,any_of(clim_var))
cg_coord <- readRDS(here("data/ClimaticData/CommonGardens/ClimateCG.rds")) %>% dplyr::select(cg,contains("ude"))
cg_names <- unique(cg_coord$cg)

# Generate scaled climatic datasets with climatic data at the location of the populations and at the location of the common gardens
cg_dfs <- generate_scaled_clim_datasets(clim_var, clim_pred = cg_clim)

# Create a dataset for genomic offset predictions with correction for population structure
cg_dfs_pcs <- expand.grid(cg_dfs$clim_pred$cg, PCs$pop) %>%
  rename(cg = Var1, pop = Var2) %>% 
  left_join(cg_dfs$clim_pred, by = "cg") %>%
  left_join(PCs, by = "pop")

# ==========================
# Genomic offset predictions
# ==========================
snp_sets <- lapply(snp_sets, function(x){

# without correction for population genetic structure
x$go_cg_uncorrected <- pred_GO_spatial_points(snp_set = x,
                                              K = 2,
                                              pop_structure = F,
                                              clim_var = clim_var,
                                              clim_ref = cg_dfs$clim_ref,
                                              clim_pred = cg_dfs$clim_pred,
                                              weights = T,
                                              method = rda_method,
                                              CG=T)

# with correction for population genetic structure
x$go_cg_corrected <- pred_GO_spatial_points(snp_set = x,
                                              K = 2,
                                              pop_structure = T,
                                              clim_var = clim_var,
                                              clim_ref = cg_dfs$clim_ref  %>% inner_join(PCs, by="pop"),
                                              clim_pred = cg_dfs_pcs,
                                              weights = T,
                                              method = rda_method,
                                              CG=T)
return(x)
})

We map genomic offset predictions at the locations of the populations

Code
lapply(cg_names, function(cg_name){
  
  lapply(snp_sets, function(x) {
    
    lapply(c("uncorrected", "corrected"), function(correction){
      
      df <- pop_coord %>%
        left_join(x$go_cg_uncorrected[,c("pop",cg_name)], by="pop") %>% 
        dplyr::rename(GO=all_of(cg_name))
      
      p <- make_go_map(df = df,
                       point_size = 3,
                       type = "CG",
                       go_limits = c(0, max(df$GO)),
                       cg_coord = filter(cg_coord, cg == cg_name),
                       plot_title = paste0(str_to_title(cg_name), " - ",x$set_name," - RDA ",correction, " - method ",rda_method))
      
      ggsave(filename = here(paste0("figs/RDA/GOmap_",x$set_code,"_",cg_name,"_PS",correction,"_",rda_method,".pdf")), p, device = "pdf",width=7,height=7)
   
 
 # We may want to save the figures without title or legend
 # p_noTitle <- p + theme(plot.title = element_blank(),
 #                     legend.position = "none")
 # ggsave(filename = here(paste0("figs/RDA/GOmap_",x$set_code,"_",cg_name,"_PS",correction,"_",rda_method,"__noTitle.pdf")), p_noTitle, device = "pdf",width=7,height=7)
    
p

      })
    })
  })

Correlations among genomic offset predictions in the common gardens.

Code
lapply(cg_names, function(cg_name){

  lapply(c("uncorrected","corrected"), function(correction){
    
    lapply(snp_sets, function(x) x[[paste0("go_cg_",correction)]]) %>% 
      
      lapply(function(set){set[,c("pop",cg_name)]}) %>%
      bind_rows(.id="set") %>% 
      pivot_wider(names_from = "set", values_from = all_of(cg_name))
    
    })  %>% 
    setNames(c("(uncorr.)","(corr.)")) %>% 
    bind_rows(.id="correction") %>% 
    pivot_wider(names_from = "correction", values_from = names(snp_sets), names_sep = " ") %>% 
    dplyr::select(-pop) %>% 
    
    cor() %>% 
    corrplot(method = 'number',type = 'lower', 
             diag = FALSE,mar=c(0,0,2,0),
             title=str_to_title(cg_name),
             number.cex=1,tl.cex=1)

})

Let’s save the genomic offset predictions for comparison with the other methods.

Code
snp_sets %>% saveRDS(file=here(paste0("outputs/RDA/go_predictions_",rda_method,".rds")))

7 Session information

Code
devtools::session_info()
─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.4.2 (2024-10-31)
 os       Ubuntu 22.04.5 LTS
 system   x86_64, linux-gnu
 ui       X11
 language (EN)
 collate  fr_FR.UTF-8
 ctype    fr_FR.UTF-8
 tz       Europe/Paris
 date     2025-01-16
 pandoc   3.1.11 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/x86_64/ (via rmarkdown)

─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 package           * version date (UTC) lib source
 cachem              1.0.8   2023-05-01 [1] CRAN (R 4.4.0)
 class               7.3-23  2025-01-01 [4] CRAN (R 4.4.2)
 classInt            0.4-10  2023-09-05 [1] CRAN (R 4.4.0)
 cli                 3.6.2   2023-12-11 [1] CRAN (R 4.4.0)
 cluster             2.1.8   2024-12-11 [4] CRAN (R 4.4.2)
 codetools           0.2-19  2023-02-01 [4] CRAN (R 4.2.2)
 colorspace          2.1-0   2023-01-23 [1] CRAN (R 4.4.0)
 corrplot          * 0.92    2021-11-18 [1] CRAN (R 4.4.0)
 cowplot           * 1.1.3   2024-01-22 [1] CRAN (R 4.4.0)
 DBI                 1.2.2   2024-02-16 [1] CRAN (R 4.4.0)
 devtools            2.4.5   2022-10-11 [1] CRAN (R 4.4.0)
 digest              0.6.35  2024-03-11 [1] CRAN (R 4.4.0)
 dplyr             * 1.1.4   2023-11-17 [1] CRAN (R 4.4.0)
 e1071               1.7-14  2023-12-06 [1] CRAN (R 4.4.0)
 ellipsis            0.3.2   2021-04-29 [1] CRAN (R 4.4.0)
 evaluate            0.23    2023-11-01 [1] CRAN (R 4.4.0)
 fansi               1.0.6   2023-12-08 [1] CRAN (R 4.4.0)
 farver              2.1.2   2024-05-13 [1] CRAN (R 4.4.0)
 fastmap             1.1.1   2023-02-24 [1] CRAN (R 4.4.0)
 forcats           * 1.0.0   2023-01-29 [1] CRAN (R 4.4.0)
 fs                  1.6.4   2024-04-25 [1] CRAN (R 4.4.0)
 generics            0.1.3   2022-07-05 [1] CRAN (R 4.4.0)
 geosphere         * 1.5-18  2022-11-15 [1] CRAN (R 4.4.0)
 ggplot2           * 3.5.1   2024-04-23 [1] CRAN (R 4.4.0)
 glue                1.7.0   2024-01-09 [1] CRAN (R 4.4.0)
 gtable              0.3.5   2024-04-22 [1] CRAN (R 4.4.0)
 here              * 1.0.1   2020-12-13 [1] CRAN (R 4.4.0)
 highr               0.10    2022-12-22 [1] CRAN (R 4.4.0)
 hms                 1.1.3   2023-03-21 [1] CRAN (R 4.4.0)
 htmltools           0.5.8.1 2024-04-04 [1] CRAN (R 4.4.0)
 htmlwidgets         1.6.4   2023-12-06 [1] CRAN (R 4.4.0)
 httpuv              1.6.15  2024-03-26 [1] CRAN (R 4.4.0)
 httr                1.4.7   2023-08-15 [1] CRAN (R 4.4.0)
 jsonlite            1.8.8   2023-12-04 [1] CRAN (R 4.4.0)
 kableExtra        * 1.4.0   2024-01-24 [1] CRAN (R 4.4.0)
 KernSmooth          2.23-26 2025-01-01 [4] CRAN (R 4.4.2)
 knitr             * 1.46    2024-04-06 [1] CRAN (R 4.4.0)
 labeling            0.4.3   2023-08-29 [1] CRAN (R 4.4.0)
 later               1.3.2   2023-12-06 [1] CRAN (R 4.4.0)
 latex2exp         * 0.9.6   2022-11-28 [1] CRAN (R 4.4.0)
 lattice           * 0.22-5  2023-10-24 [4] CRAN (R 4.3.1)
 lifecycle           1.0.4   2023-11-07 [1] CRAN (R 4.4.0)
 lubridate         * 1.9.3   2023-09-27 [1] CRAN (R 4.4.0)
 magrittr          * 2.0.3   2022-03-30 [1] CRAN (R 4.4.0)
 MASS                7.3-64  2025-01-04 [4] CRAN (R 4.4.2)
 Matrix              1.7-1   2024-10-18 [4] CRAN (R 4.4.1)
 memoise             2.0.1   2021-11-26 [1] CRAN (R 4.4.0)
 mgcv                1.9-1   2023-12-21 [4] CRAN (R 4.3.2)
 mime                0.12    2021-09-28 [1] CRAN (R 4.4.0)
 miniUI              0.1.1.1 2018-05-18 [1] CRAN (R 4.4.0)
 munsell             0.5.1   2024-04-01 [1] CRAN (R 4.4.0)
 nlme                3.1-166 2024-08-14 [4] CRAN (R 4.4.2)
 permute           * 0.9-7   2022-01-27 [1] CRAN (R 4.4.0)
 pillar              1.9.0   2023-03-22 [1] CRAN (R 4.4.0)
 pkgbuild            1.4.4   2024-03-17 [1] CRAN (R 4.4.0)
 pkgconfig           2.0.3   2019-09-22 [1] CRAN (R 4.4.0)
 pkgload             1.3.4   2024-01-16 [1] CRAN (R 4.4.0)
 profvis             0.3.8   2023-05-02 [1] CRAN (R 4.4.0)
 promises            1.3.0   2024-04-05 [1] CRAN (R 4.4.0)
 proxy               0.4-27  2022-06-09 [1] CRAN (R 4.4.0)
 purrr             * 1.0.2   2023-08-10 [1] CRAN (R 4.4.0)
 R6                  2.5.1   2021-08-19 [1] CRAN (R 4.4.0)
 ragg                1.3.0   2024-03-13 [1] CRAN (R 4.4.0)
 raster            * 3.6-26  2023-10-14 [1] CRAN (R 4.4.0)
 RColorBrewer      * 1.1-3   2022-04-03 [1] CRAN (R 4.4.0)
 Rcpp                1.0.12  2024-01-09 [1] CRAN (R 4.4.0)
 readr             * 2.1.5   2024-01-10 [1] CRAN (R 4.4.0)
 remotes             2.5.0   2024-03-17 [1] CRAN (R 4.4.0)
 rlang               1.1.4   2024-06-04 [1] CRAN (R 4.4.0)
 rmarkdown           2.26    2024-03-05 [1] CRAN (R 4.4.0)
 rnaturalearth     * 1.0.1   2023-12-15 [1] CRAN (R 4.4.0)
 rnaturalearthdata * 1.0.0   2024-02-09 [1] CRAN (R 4.4.0)
 rprojroot           2.0.4   2023-11-05 [1] CRAN (R 4.4.0)
 rstudioapi          0.16.0  2024-03-24 [1] CRAN (R 4.4.0)
 scales              1.3.0   2023-11-28 [1] CRAN (R 4.4.0)
 sessioninfo         1.2.2   2021-12-06 [1] CRAN (R 4.4.0)
 sf                * 1.0-16  2024-03-24 [1] CRAN (R 4.4.0)
 shiny               1.8.1.1 2024-04-02 [1] CRAN (R 4.4.0)
 sp                * 2.1-4   2024-04-30 [1] CRAN (R 4.4.0)
 stringi             1.8.4   2024-05-06 [1] CRAN (R 4.4.0)
 stringr           * 1.5.1   2023-11-14 [1] CRAN (R 4.4.0)
 svglite             2.1.3   2023-12-08 [1] CRAN (R 4.4.0)
 systemfonts         1.0.6   2024-03-07 [1] CRAN (R 4.4.0)
 terra               1.7-71  2024-01-31 [1] CRAN (R 4.4.0)
 textshaping         0.3.7   2023-10-09 [1] CRAN (R 4.4.0)
 tibble            * 3.2.1   2023-03-20 [1] CRAN (R 4.4.0)
 tidyr             * 1.3.1   2024-01-24 [1] CRAN (R 4.4.0)
 tidyselect          1.2.1   2024-03-11 [1] CRAN (R 4.4.0)
 tidyverse         * 2.0.0   2023-02-22 [1] CRAN (R 4.4.0)
 timechange          0.3.0   2024-01-18 [1] CRAN (R 4.4.0)
 tzdb                0.4.0   2023-05-12 [1] CRAN (R 4.4.0)
 units               0.8-5   2023-11-28 [1] CRAN (R 4.4.0)
 urlchecker          1.0.1   2021-11-30 [1] CRAN (R 4.4.0)
 usethis             2.2.3   2024-02-19 [1] CRAN (R 4.4.0)
 utf8                1.2.4   2023-10-22 [1] CRAN (R 4.4.0)
 vctrs               0.6.5   2023-12-01 [1] CRAN (R 4.4.0)
 vegan             * 2.6-4   2022-10-11 [1] CRAN (R 4.4.0)
 viridisLite         0.4.2   2023-05-02 [1] CRAN (R 4.4.0)
 withr               3.0.0   2024-01-16 [1] CRAN (R 4.4.0)
 xfun                0.43    2024-03-25 [1] CRAN (R 4.4.0)
 xml2                1.3.6   2023-12-04 [1] CRAN (R 4.4.0)
 xtable              1.8-4   2019-04-21 [1] CRAN (R 4.4.0)
 yaml                2.3.8   2023-12-11 [1] CRAN (R 4.4.0)

 [1] /home/juliette/R/x86_64-pc-linux-gnu-library/4.4
 [2] /usr/local/lib/R/site-library
 [3] /usr/lib/R/site-library
 [4] /usr/lib/R/library

──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

References

Capblancq, Thibaut, and Brenna R Forester. 2021. “Redundancy Analysis: A Swiss Army Knife for Landscape Genomics.” Methods in Ecology and Evolution 12 (12): 2298–2309.